20 research outputs found

    Autonomic Database Management: State of the Art and Future Trends

    In recent years, Database Management Systems (DBMS) have increased significantly in size and complexity, increasing the extent to which database administration is a time-consuming and expensive task. Database Administrator (DBA) expenses have become a significant part of the total cost of ownership. This results in the need to develop Autonomous Database Management systems (ADBMS) that would manage themselves without human intervention. Accordingly, this paper evaluates the current state of autonomous database systems and identifies gaps and challenges in the achievement of fully autonomic databases. In addition to highlighting technical challenges and gaps, we identify one human factor, gaining the trust of DBAs, as a major obstacle. Without human acceptance and trust, the goal of achieving fully autonomic databases cannot be realized

    Energy Consumption Prediction with Big Data: Balancing Prediction Accuracy and Computational Resources

    In recent years, advances in sensor technologies and expansion of smart meters have resulted in massive growth of energy data sets. These Big Data have created new opportunities for energy prediction, but at the same time, they impose new challenges for traditional technologies. On the other hand, new approaches for handling and processing these Big Data have emerged, such as MapReduce, Spark, Storm, and Oxdata H2O. This paper explores how findings from machine learning with Big Data can benefit energy consumption prediction. An approach based on local learning with support vector regression (SVR) is presented. Although local learning itself is not a novel concept, it has great potential in the Big Data domain because it reduces computational complexity. The local SVR approach presented here is compared to traditional SVR and to deep neural networks with an H2O machine learning platform for Big Data. Local SVR outperformed both SVR and H2O deep learning in terms of prediction accuracy and computation time. Especially significant was the reduction in training time; local SVR training was an order of magnitude faster than SVR or H2O deep learning

    MLaaS: Machine Learning as a Service

    The demand for knowledge extraction has been increasing. With the growing amount of data being generated by global data sources (e.g., social media and mobile apps) and the popularization of context-specific data (e.g., the Internet of Things), companies and researchers need to connect all these data and extract valuable information. Machine learning has been gaining much attention in data mining, leveraging the birth of new solutions. This paper proposes an architecture to create a flexible and scalable machine learning as a service. An open source solution was implemented and presented. As a case study, a forecast of electricity demand was generated using real-world sensor and weather data by running different algorithms at the same time

    Predicting energy demand peak using M5 model trees

    Predicting energy demand peak is a key factor for reducing energy demand and electricity bills for commercial customers. Features influencing energy demand are many and complex, such as occupant behaviours and temperature. Feature selection can decrease prediction model complexity without sacrificing performance. In this paper, features were selected based on their multiple linear regression correlation coefficients. This paper discusses the capabilities of M5 model trees in energy demand prediction for commercial buildings. M5 model trees are similar to regression trees; however, they are more suitable for continuous prediction problems. The M5 model tree prediction was developed based on a selected feature set including sensor energy demand readings, day of the week, season, humidity, and weather conditions (sunny, rain, etc.). The performance of the M5 model tree was evaluated by comparing it to the support vector regression (SVR) and artificial neural networks (ANN) models. The M5 model tree outperformed the SVR and ANN models with a mean absolute error (MAE) of 8.94 compared to 10.02 and 12.04 for the SVR and ANN models respectively
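
    The MAE metric used for the comparison above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code; the function name and the sample readings are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand readings and model predictions, for illustration only.
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 33.0]
mae = mean_absolute_error(observed, forecast)
```

    A lower MAE means the predictions are, on average, closer to the observed values, which is the sense in which one model is said to outperform another here.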

    Collaborative knowledge as a service applied to the disaster management domain

    Cloud computing offers services which promise to meet continuously increasing computing demands by using a large number of networked resources. However, data heterogeneity remains a major hurdle for data interoperability and data integration. In this context, a Knowledge as a Service (KaaS) approach has been proposed with the aim of generating knowledge from heterogeneous data and making it available as a service. In this paper, a Collaborative Knowledge as a Service (CKaaS) architecture is proposed, with the objective of satisfying consumer knowledge needs by integrating disparate cloud knowledge through collaboration among distributed KaaS entities. The NIST cloud computing reference architecture is extended by adding a KaaS layer that integrates diverse sources of data stored in a cloud environment. CKaaS implementation is domain-specific; therefore, this paper presents its application to the disaster management domain. A use case demonstrates collaboration of knowledge providers and shows how CKaaS operates with simulation models

    Energy Cost Forecasting for Event Venues

    Electricity price, consumption, and demand forecasting has been a topic of research interest for a long time. The proliferation of smart meters has created new opportunities in energy prediction. This paper investigates energy cost forecasting in the context of entertainment event-organizing venues, which poses significant difficulty due to fluctuations in energy demand and wholesale electricity prices. The objective is to predict the overall cost of energy consumed during an entertainment event. Predictions are carried out separately for each event category and feature selection is used to select the most effective combination of event attributes for each category. Three machine learning approaches are considered: k-nearest neighbor (KNN) regression, support vector regression (SVR) and neural networks (NN). These approaches are evaluated on a case study involving a large event venue in Southern Ontario. In terms of prediction accuracy, KNN regression achieved the lowest average error. Error rates varied greatly among different event categories
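
    As a rough illustration of k-nearest neighbor regression, one of the three approaches named above, the sketch below averages the targets of the k closest training points. The function name, the single numeric feature, and the toy data are assumptions made for the example; the paper's event attributes and distance measure are not given in the abstract.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest neighbors."""
    if k < 1 or k > len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training target with its Euclidean distance to the query point.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Toy data: one hypothetical feature per event and a made-up energy cost.
features = [(0.0,), (1.0,), (2.0,), (10.0,)]
costs = [0.0, 1.0, 2.0, 10.0]
estimate = knn_predict(features, costs, (1.0,), k=3)
```

    Because the prediction depends only on nearby examples, fitting one such model per event category, as the abstract describes, amounts to keeping a separate training set for each category.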

    CEPSim: Modelling and Simulation of Complex Event Processing Systems in Cloud Environments

    The emergence of Big Data has had profound impacts on how data are stored and processed. As technologies created to process continuous streams of data with low latency, Complex Event Processing (CEP) and Stream Processing (SP) have often been related to the Big Data velocity dimension and used in this context. Many modern CEP and SP systems leverage cloud environments to provide the low latency and scalability required by Big Data applications, yet validating these systems at the required scale is a research problem per se. Cloud computing simulators have been used as a tool to facilitate reproducible and repeatable experiments in clouds. Nevertheless, existing simulators are mostly based on simple application and simulation models that are not appropriate for CEP or for SP. This article presents CEPSim, a simulator for CEP and SP systems in cloud environments. CEPSim proposes a query model based on Directed Acyclic Graphs (DAGs) and introduces a simulation algorithm based on a novel abstraction called event sets. CEPSim is highly customizable and can be used to analyze the performance and scalability of user-defined queries and to evaluate the effects of various query processing strategies. Experimental results show that CEPSim can simulate existing systems in large Big Data scenarios with accuracy and precision

    Knowledge as a Service Framework for Disaster Data Management

    Each year, a number of natural disasters strike across the globe, killing hundreds and causing billions of dollars in property and infrastructure damage. Minimizing the impact of disasters is imperative in today’s society. As the capabilities of software and hardware evolve, so does the role of information and communication technology in disaster mitigation, preparation, response, and recovery. A large quantity of disaster-related data is available, including response plans, records of previous incidents, simulation data, social media data, and Web sites. However, current data management solutions offer few or no integration capabilities. Moreover, recent advances in cloud computing, big data, and NoSQL open the door for new solutions in disaster data management. In this paper, a Knowledge as a Service (KaaS) framework is proposed for disaster cloud data management (Disaster-CDM), with the objectives of 1) storing large amounts of disaster-related data from diverse sources, 2) facilitating search, and 3) supporting their interoperability and integration. Data are stored in a cloud environment using a combination of relational and NoSQL databases. The case study presented in this paper illustrates the use of Disaster-CDM on an example of simulation models

    Federated Critical Infrastructure Simulators: Towards Ontologies for Support of Collaboration

    Our society relies greatly on a variety of critical infrastructures (CI), such as power system networks, water distribution, oil and natural gas systems, telecommunication networks and others. Interdependency between those systems is high and may result in cascading failures spanning different infrastructures. The behavior of each CI can be observed and analyzed through the use of domain simulators, but this does not account for their interdependency. To explore CI interdependencies, domain simulators need to be integrated in a federation where they can collaborate. This paper explores three different simulators: the EPANET water distribution simulator, the PSCAD power system simulator and the I2Sim infrastructure interdependency simulator. Each simulator’s modeling approach is explored, and the similarities and differences between the approaches are determined. A core ontology is created for each simulation engine, along with an initial mapping between the ontologies. The ontologies and their mapping will support collaboration of simulators by enabling exchange of information in a semantic manner

    Energy Forecasting for Event Venues: Big Data and Prediction Accuracy

    Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy prediction accuracy is explored. Two machine-learning approaches, neural networks (NN) and support vector regression (SVR), were considered together with three data granularities: daily, hourly, and 15 minutes. The approach has been applied to a large entertainment venue located in Ontario, Canada. Daily data intervals resulted in higher consumption prediction accuracy than hourly or 15-min readings, which can be explained by the inability of the hourly and 15-min models to capture random variations. With daily data, the NN model achieved better accuracy than the SVR; however, with hourly and 15-min data, there was no definitive dominance of one approach over another. Accuracy of daily peak demand prediction was significantly higher than accuracy of consumption prediction